Scalable text classification as a tool for personalization
Abstract
We consider scalability issues in text classification, where (multi-)labeled training documents are used to build classifiers that assign documents to classes, permitting a document to belong to more than one class. A new class of classification problems, called ‘scalable’, is introduced, with applications in web mining. Scalable classification uses newly classified instances to improve the accuracy of future classifications and to capture changes in the semantic representation of different topics. In addition, different similarity classes can be defined, resulting in a ‘per-user’ classification procedure. Such an approach provides a new methodology for building personalized applications, because the user becomes part of the classification procedure. We explore solutions to the scalable text classification problem and introduce an algorithm that exploits a new text analysis technique, which decomposes documents into the vector representations of their sentences according to the user's expertise. Finally, a web-based personalized news categorization system based on this approach is presented.
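The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes term-count vectors per sentence, one summed centroid per class, cosine similarity, and a per-user `threshold` standing in for the user-defined similarity class. The names `ScalableClassifier`, `sentence_vectors`, `STOPWORDS`, and `threshold` are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "in", "of", "and", "to", "is"}


def sentence_vectors(text):
    """Decompose a document into one term-count vector per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [
        Counter(t for t in re.findall(r"[a-z]+", s.lower()) if t not in STOPWORDS)
        for s in sentences
    ]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ScalableClassifier:
    """Multi-label centroid classifier that learns from its own classifications."""

    def __init__(self, threshold=0.3):
        # Assumption: the per-user similarity class is modeled as a single threshold.
        self.threshold = threshold
        self.centroids = {}  # label -> summed term counts

    def train(self, text, labels):
        """Add every sentence vector of the document to each given class centroid."""
        for vec in sentence_vectors(text):
            for label in labels:
                self.centroids.setdefault(label, Counter()).update(vec)

    def classify(self, text, feedback=True):
        """Return all classes that some sentence matches at or above the threshold."""
        vectors = sentence_vectors(text)
        labels = {
            label
            for label, centroid in self.centroids.items()
            if any(cosine(v, centroid) >= self.threshold for v in vectors)
        }
        if feedback and labels:
            # The 'scalable' step: the newly classified document updates the centroids.
            self.train(text, labels)
        return labels
```

A stricter user would pass a higher `threshold` and receive fewer, more closely matching classes; with `feedback=True`, each classified document shifts the centroids so that later classifications reflect how a topic's vocabulary changes.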
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Beyond its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Scalability of Text Classification
We explore scalability issues of the text classification problem where using (multi)labeled training documents we try to build classifiers that assign documents into classes permitting classification in multiple classes. A new class of classification problems, called ‘scalable’ is introduced that models many problems from the area of Web mining. The property of scalability is defined as the abi...
Combining the Classifiers and Lsi Method for Efficient and Accurate Text Classification
Text classification involves assignment of predetermined categories to textual resources. Applications of text classification include recommendation systems, personalization, help desk automation, content filtering and routing, selective alerting, and training. This paper describes an experiment for improving the classification accuracy of a large text corpus by the use of dimensionality reduct...
Adaptive Information Analysis in Higher Education Institutes
Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...
Journal: Comput. Syst. Sci. Eng.
Volume: 24
Publication date: 2009